The Performance Value of Shared Network Caches in Clustered Multiprocessor Workstations
Authors
Abstract
This paper evaluates the benefit of adding a shared cache to the network interface as a means of improving the performance of networked workstations configured as a distributed shared memory multiprocessor. A cache on the network interface, shared by all processors on each cluster, offers the potential benefits of retaining evicted processor cache lines, providing implicit prefetching when network cache lines are longer than processor cache lines, and increasing intra-cluster sharing. Using simulation, the performance of eight parallel scientific applications was evaluated. In each case, we examined in detail the means by which processor cache misses were satisfied. Our results were mixed. For the applications studied, we found that the network cache offers substantial performance benefit when processor caches are too small to hold the application's primary working set or when network contention limits application performance. The expected benefits of implicit prefetching and increased intra-cluster sharing did not contribute significantly to the performance enhancement of the network cache for most applications. Finally, the advantage afforded by the network cache diminishes as processor cache size increases and network contention decreases.
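The three potential benefits named in the abstract can be illustrated with a toy model. The sketch below is not the paper's simulator: it assumes fully associative LRU caches, ignores writes, coherence, and timing, and uses hypothetical line sizes (32-byte processor lines, 128-byte network cache lines). The names `LRUCache` and `simulate` are made up for this illustration. It only counts where each access is satisfied.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative LRU cache of fixed-size lines (a simplifying assumption)."""

    def __init__(self, num_lines, line_size):
        self.num_lines = num_lines
        self.line_size = line_size
        self.lines = OrderedDict()  # line tag -> True, ordered oldest to newest

    def _tag(self, addr):
        return addr // self.line_size

    def lookup(self, addr):
        """Return True on a hit and mark the line as most recently used."""
        tag = self._tag(addr)
        if tag in self.lines:
            self.lines.move_to_end(tag)
            return True
        return False

    def fill(self, addr):
        """Insert the line holding addr, evicting the least recently used line if full."""
        tag = self._tag(addr)
        self.lines[tag] = True
        self.lines.move_to_end(tag)
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)


def simulate(trace, proc_lines, net_lines, proc_line_size=32, net_line_size=128):
    """Count where each (cpu_id, address) access in trace is satisfied.

    One private cache per CPU and one network cache shared by the cluster.
    All sizes are hypothetical defaults, not values from the paper.
    """
    proc_caches = {}
    net_cache = LRUCache(net_lines, net_line_size)
    counts = {"processor_cache": 0, "network_cache": 0, "remote": 0}
    for cpu, addr in trace:
        cache = proc_caches.setdefault(cpu, LRUCache(proc_lines, proc_line_size))
        if cache.lookup(addr):
            counts["processor_cache"] += 1
        elif net_cache.lookup(addr):
            counts["network_cache"] += 1  # miss served inside the cluster
            cache.fill(addr)
        else:
            counts["remote"] += 1  # miss goes out over the network
            net_cache.fill(addr)
            cache.fill(addr)
    return counts


# CPU 0 walks four consecutive 32-byte lines, CPU 1 touches the first one,
# then CPU 0 returns to a line its two-line cache has already evicted.
trace = [(0, 0), (0, 32), (0, 64), (0, 96), (1, 0), (0, 0)]
print(simulate(trace, proc_lines=2, net_lines=4))
print(simulate(trace, proc_lines=2, net_lines=0))  # no network cache
```

With the network cache, only the first access goes remote: the next three hit the longer network line (implicit prefetching), CPU 1's access hits data fetched by CPU 0 (intra-cluster sharing), and the last access hits a line that CPU 0's cache had evicted. With `net_lines=0`, all six accesses go remote.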
Similar resources
The Effectiveness of SRAM Network Caches in Clustered DSMs
The frequency of accesses to remote data is a key factor affecting the performance of all Distributed Shared Memory (DSM) systems. Remote data caching is one of the most effective and general techniques to fight processor stalls due to remote capacity misses in the processor caches. The design space of remote data caches (RDC) has many dimensions and one essential performance trade-off: hit rat...
A Scaleable Multiprocessor Architecture with Multiple Read-Write Memory Model
This paper presents a scalable multiprocessor architecture with multiple access memories and multi-way busses. This parallel architecture with more intelligent memory model and efficient multi-way interconnection network organization is called as CRrCW (Concurrent Read and restricted Concurrent Write) scaleable multiprocessor system. The memory and network model provides concurrent memory acces...
Comparison of the Performance of Two Service Disciplines for a Shared Bus Multiprocessor with Private Caches
In this paper, we compare two analytical models for evaluation of cache coherence overhead of a shared bus multiprocessor with private caches. The models are based on a closed queuing network with different service disciplines. We find that the priority discipline can be used as a lower-level bound. Some numerical results are shown graphically.
Imbedded Markov Chains Model of Multiprocessor with Shared Memory
This paper addresses the problem of evaluating the performance of multiprocessor with shared memory and private caches executing Invalidate Coherence Protocols. The model is grounded in queuing network theory and includes bus interference, cache interference, and main memory interference. The method of the Imbedded Markov Chains is used. The highest and lowest performance characteristics are ca...
Scalable Inter-Cluster Communication Systems for Clustered Multiprocessors
As workstation clusters move away from uniprocessors in favor of multiprocessors to support the increasing computational needs of distributed applications, greater demands are placed on the communication interfaces that couple individual workstations. This paper investigates scalable, efficient, and reliable communication systems for multiprocessor clusters that use commodity local area networks ...